12 research outputs found

    Massively parallel implicit equal-weights particle filter for ocean drift trajectory forecasting

    Forecasting of ocean drift trajectories is important for many applications, including search and rescue operations, oil spill cleanup and iceberg risk mitigation. In an operational setting, forecasts of drift trajectories are produced based on computationally demanding forecasts of three-dimensional ocean currents. Herein, we investigate a complementary approach for shorter time scales by using the recently proposed two-stage implicit equal-weights particle filter applied to a simplified ocean model. To achieve this, we present a new algorithmic design for a data-assimilation system in which all components – including the model, model errors, and particle filter – take advantage of massively parallel compute architectures, such as graphics processing units. Faster computations can enable in-situ and ad-hoc model runs for emergency management, and larger ensembles for better uncertainty quantification. Using a challenging test case with near-realistic chaotic instabilities, we run data-assimilation experiments based on synthetic observations from drifting and moored buoys, and analyze the trajectory forecasts for the drifters. Our results show that even sparse drifter observations are sufficient to significantly improve short-term drift forecasts up to twelve hours. With equidistant moored buoys observing only 0.1% of the state space, the ensemble gives an accurate description of the true state after data assimilation followed by a high-quality probabilistic forecast

    Bias Correction of Operational Storm Surge Forecasts Using Neural Networks

    Storm surges can give rise to extreme floods in coastal areas. The Norwegian Meteorological Institute produces 120-hour regional operational storm surge forecasts along the coast of Norway based on the Regional Ocean Modeling System (ROMS), using a model setup called Nordic4-SS. Despite advances in the development of models and computational capabilities, forecast errors remain large enough to impact response measures and issued alerts, in particular during the strongest events. Reducing these errors will positively impact the efficiency of the warning systems while minimizing efforts and resources spent on mitigation. Here, we investigate how forecasts can be improved with residual learning, i.e., training data-driven models to predict the residuals in forecasts from Nordic4-SS. A simple error mapping technique and a more sophisticated Neural Network (NN) method are tested. Using the NN residual correction method, the Root Mean Square Error in the Oslo Fjord is reduced by 36% for lead times of one hour and 9% for 24 hours. Therefore, the residual NN method is a promising direction for correcting storm surge forecasts, especially on short timescales. Moreover, it is well adapted to being deployed operationally, as i) the correction is applied on top of the existing model and requires no changes to it, ii) all predictors used for NN inference are already available operationally, iii) prediction by the NNs is very fast, typically a few seconds per station, and iv) the NN correction can be provided to a human expert who may inspect it, compare it with the model output, and see how much correction is brought by the NN, which makes it possible to capitalize on human expertise as a quality validation of the NN output. While no changes to the hydrodynamic model are necessary to calibrate the neural networks, they are specific to a given model and must be recalibrated when the numerical models are updated
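
    The residual-learning idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's method: the neural network is swapped for a one-variable least-squares fit, and the function names (`fit_residual_model`, `correct`, `rmse`) and the toy numbers are hypothetical. The point it shows is only that the correction is learned from forecast residuals and added on top of an unchanged forecast.

```python
from statistics import mean


def fit_residual_model(forecasts, observations):
    """Fit residual ~ a + b * forecast by ordinary least squares.

    Stand-in for the data-driven model; the paper uses neural networks
    and operational predictors, which are not reproduced here.
    """
    residuals = [o - f for f, o in zip(forecasts, observations)]
    f_mean = mean(forecasts)
    r_mean = mean(residuals)
    var = sum((f - f_mean) ** 2 for f in forecasts)
    cov = sum((f - f_mean) * (r - r_mean) for f, r in zip(forecasts, residuals))
    b = cov / var
    a = r_mean - b * f_mean
    return a, b


def correct(forecast, model):
    """Add the predicted residual on top of the unchanged forecast."""
    a, b = model
    return forecast + (a + b * forecast)


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return (sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n) ** 0.5


# Toy data (invented for illustration): the forecast has a linear bias.
forecasts = [0.0, 1.0, 2.0, 3.0]
observations = [0.1, 1.3, 2.5, 3.7]
model = fit_residual_model(forecasts, observations)
corrected = [correct(f, model) for f in forecasts]
```

    Because the correction is a separate additive term, the raw forecast and the corrected one can be compared side by side, which mirrors the inspection by a human expert that the abstract mentions.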

    Shallow Water Simulations on Graphics Hardware

    Conservation laws describing one or more conserved quantities in time arise in a multitude of different scientific areas. Mathematically, conservation laws are expressed as partial differential equations (PDEs). In this thesis, the shallow water equations are the particular system of interest, and flood simulations the main application area. There are several numerical methods for approximating the solution of hyperbolic PDEs like the shallow water equations, and finite volume methods constitute an important class. Explicit finite volume methods typically rely on stencil computations, making them inherently parallel, and therefore a near perfect match for the many-core graphics processing unit (GPU) found on today’s graphics cards. The GPU is one example of the accelerators now used in high performance computing. Accelerators are typically power efficient, and deliver higher computational performance per dollar than traditional CPUs. Through the scientific papers in this thesis, we present efficient hardware-adapted shallow water simulations on the GPU, based on a high-resolution central-upwind scheme. The topics range from best practices for stencil computations on the GPU to adaptive mesh refinement. This work extends to architectures similar to the GPU and to other hyperbolic conservation laws
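
    To make the stencil remark concrete, here is a minimal sketch of one explicit finite-volume step. It deliberately uses a much simpler setting than the thesis: first-order upwind fluxes for 1D linear advection on a periodic grid, not the high-resolution central-upwind scheme for the shallow water equations. The function name `upwind_step` and the parameter `courant` are hypothetical. Each new cell value depends only on the old values of the cell and its left neighbour, so all cells can be updated independently, which is the property that maps well to GPUs.

```python
def upwind_step(u, courant):
    """One explicit step for u_t + a * u_x = 0 with a > 0 on a periodic grid.

    `courant` is a * dt / dx and should satisfy 0 < courant <= 1 for stability.
    The update for cell i reads only old values u[i] and u[i - 1]
    (a two-point stencil); u[-1] wraps around, giving periodic boundaries.
    """
    return [u[i] - courant * (u[i] - u[i - 1]) for i in range(len(u))]


# A square pulse advected with Courant number 0.5.
initial = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
after_one_step = upwind_step(initial, 0.5)
```

    The scheme is written in flux-difference form, so the sum of all cell values is unchanged by a step on the periodic grid; with a Courant number of exactly 1 the pulse is shifted by one cell without smearing.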

    Solving systems of hyperbolic PDEs using multiple GPUs

    This thesis spans several research areas, with the main topics being parallel programming based on message-passing, general-purpose computation on graphics processing units (GPGPU), numerical simulations, and domain decomposition. The graphics processing unit (GPU) on modern graphics adapters is an inexpensive source of vast parallel computing power. To harvest this power, general-purpose graphics programming is used. The main agenda of the thesis is to make a case for GPU clusters. Numerical simulations of hyperbolic conservation laws using explicit temporal difference methods (finite-difference methods (FDM), finite-volume methods (FVM) and modern high-resolution methods) are used as test-cases. The GPU cluster is proven to be usable, efficient and sufficiently accurate on the chosen test-cases. A white paper where the GPU cluster is used to perform PLU-factorizations of matrices is also included as an appendix

    GPU Computing with Python: Performance, Energy Efficiency and Usability

    In this work, we examine the performance, energy efficiency, and usability when using Python for developing high-performance computing codes running on the graphics processing unit (GPU). We investigate the portability of performance and energy efficiency between Compute Unified Device Architecture (CUDA) and Open Compute Language (OpenCL); between GPU generations; and between low-end, mid-range, and high-end GPUs. Our findings showed that the impact of using Python is negligible for our applications, and furthermore, CUDA and OpenCL applications tuned to an equivalent level can in many cases obtain the same computational performance. Our experiments showed that performance in general varies more between different GPUs than between using CUDA and OpenCL. We also show that tuning for performance is a good way of tuning for energy efficiency, but that specific tuning is needed to obtain optimal energy efficiency

    Data Assimilation for Ocean Drift Trajectories Using Massive Ensembles and GPUs

    In this work, we perform fully nonlinear data assimilation of ocean drift trajectories using multiple GPUs. We use an ensemble of up to 10000 members and the sequential importance resampling algorithm to assimilate observations of drift trajectories into the underlying shallow-water simulation model. Our results show an improved drift trajectory forecast using data assimilation for a complex and realistic simulation scenario, and the implementation exhibits good weak and strong scaling
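
    The sequential importance resampling algorithm named above can be sketched for a toy problem. This is an illustration under strong assumptions, not the paper's implementation: each particle is a single number rather than a full shallow-water state, the observation error is assumed Gaussian, and the names `sir_step` and `obs_std` are hypothetical. Each particle is weighted by the likelihood of the observation, and a new ensemble is then drawn with probability proportional to those weights.

```python
import math
import random


def sir_step(particles, observation, obs_std, rng):
    """One weighting-and-resampling step of a toy SIR particle filter.

    Returns the resampled particles and the normalized weights.
    """
    # Gaussian log-likelihood of the observation for each particle
    # (the constant normalization term cancels after normalizing).
    log_w = [-0.5 * ((observation - p) / obs_std) ** 2 for p in particles]
    # Subtract the maximum before exponentiating to avoid underflow.
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Multinomial resampling: likely particles are duplicated,
    # unlikely ones tend to be dropped.
    resampled = rng.choices(particles, weights=weights, k=len(particles))
    return resampled, weights


rng = random.Random(0)  # fixed seed so the sketch is repeatable
particles = [0.0, 1.0, 2.0, 10.0]
resampled, weights = sir_step(particles, observation=1.0, obs_std=0.5, rng=rng)
```

    In a full filter this step alternates with a model forecast step that advances every particle in time. The forecast step is independent per particle, which is one reason large ensembles distribute well over multiple GPUs.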

    Performance and Energy Efficiency of CUDA and OpenCL for GPU Computing using Python

    In this work, we examine the performance and energy efficiency when using Python for developing HPC codes running on the GPU. We investigate the portability of performance and energy efficiency between CUDA and OpenCL; between GPU generations; and between low-end, mid-range and high-end GPUs. Our findings show that for some combinations of GPU and GPU code, there is a significant speedup for CUDA over OpenCL, but that this does not hold in general. Our experiments show that performance in general varies more between different GPUs than between using CUDA and OpenCL. Finally, we show that tuning for performance is a good way of tuning for energy efficiency

    Evaluation of selected finite-difference and finite-volume approaches to rotational shallow-water flow

    The shallow-water equations in a rotating frame of reference are important for capturing geophysical flows in the ocean. In this paper, we examine and compare two traditional finite-difference schemes and two modern finite-volume schemes for simulating these equations. We evaluate how well they capture the relevant physics for problems such as storm surge and drift trajectory modelling, and the schemes are put through a set of six test cases. The results are presented in a systematic manner through several tables, and we compare the qualitative and quantitative performance from a cost-benefit perspective. Of the four schemes, one of the traditional finite-difference schemes performs best in cases dominated by geostrophic balance, and one of the modern finite-volume schemes is superior for capturing gravity-driven motion. The traditional finite-difference schemes are significantly faster computationally than the modern finite-volume schemes